The INLA Approach to Bayesian models

The Integrated Nested Laplace Approximation, or INLA, approach is a recently developed, computationally simpler method for fitting Bayesian models [(Rue et al., 2009, compared to traditional Markov Chain Monte Carlo (MCMC) approaches. INLA fits models that are classified as latent Gaussian models, which are applicable in many settings (Martino & Rue, 2010. In general, INLA fits a general form of additive models such as:

\(\eta = \alpha + \sum_{j=1}^{nf} f^{(j)}(u_{ij}) + \sum_{k=1}^{n\beta}\beta_k z_{ki} + \epsilon_i\)

where \(\eta\) is the linear predictor for a generalized linear model formula, and is composed of a linear function of some variables u, \(\beta\) are the effects of covariates, z, and \(\epsilon\) is an unstructured residual (Rue et al., 2009). As this model is often parameterized as a Bayesian one, we are interested in the posterior marginal distributions of all the model parameters. Rue and Martino (2007) show that the posterior marginal for the random effects (x) in such models can be approximated as:

\(\tilde{p}(x_i|y) = \sum_k \tilde{p}(x_i|\theta_k, y) \tilde{p}(\theta_k|y) \Delta_k\)

via numerical integration (Rue & Martino, 2007; Schrodle & Held, 2011a, 2011b). The posterior distribution of the hyperparameters (\(\theta\)) of the model can also be approximated as:

\(\tilde{p}(\theta | y)) \propto \frac{p(x, \theta, y)}{\tilde{p}G(x| \theta,y)} \mid _{x} = x^*(\theta)\)

, where G is a Gaussian approximation of the posterior and \(x^*(\theta)\) is the mode of the conditional distribution of \(p(x|\theta,y)\). Thus, instead of using MCMC to find an iterative, sampling-based estimate of the posterior, it is arrived at numerically. This method of fitting the spatial models specified above has been presented by numerous authors (Blangiardo & Cameletti, 2015; Blangiardo et al., 2013; Lindgren & Rue, 2015; Martins et al., 2013; Schrodle & Held, 2011a, 2011b), with comparable results to MCMC.

Below, I show examples of using INLA to fit Bayesian regression models for areal data from Texas counties and another example of using INLA to estimate multilevel models and perform small area estimation.

#library(rgdal)
library(spdep)
## Loading required package: sp
## Loading required package: Matrix
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source'))`
library(RColorBrewer)
library(lattice)
library(INLA)
## This is INLA_17.06.20 built 2017-06-20 03:42:30 UTC.
## See www.r-inla.org/contact-us for how to get help.
library(tigris)
## To enable 
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
## 
## Attaching package: 'tigris'
## The following object is masked from 'package:graphics':
## 
##     plot
library(tidycensus)
library(ggplot2)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:ggplot2':
## 
##     vars
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
files<-list.files("~/Google Drive/a&m_stuff/workshop_5_14_18/vita_stat/", pattern = "*.csv", full.names = T)
vital<-lapply(files, read.csv, header=T)
library(plyr)
## -------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## -------------------------------------------------------------------------
## 
## Attaching package: 'plyr'
## The following objects are masked from 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
df <- ldply(vital, data.frame)
df$cofips<-paste(substr(df$GISJOIN, 2,3), substr(df$GISJOIN, 5,7), sep="")
df<-df%>%
  filter(YEAR %in%2000:2007, STATEA==480)%>%
  mutate(births=AGWE001, deaths=AGWG001)%>%
  select(YEAR, cofips, births, deaths)
head(df)
##   YEAR cofips births deaths
## 1 2000  48001    673    626
## 2 2000  48003    210     94
## 3 2000  48005   1311    824
## 4 2000  48007    266    298
## 5 2000  48009     75     71
## 6 2000  48011     25     24
popurl<-url("http://www2.census.gov/programs-surveys/popest/datasets/2000-2010/intercensal/county/co-est00int-tot.csv")
pops<-read.csv(popurl)
names(pops)<-tolower(names(pops))
pops<-pops%>%
  mutate(cofips = paste(sprintf(fmt = "%02d", state), sprintf(fmt = "%03d",county), sep=""))%>%
  filter(sumlev==50, state==48)

pops$struct<-1:dim(pops)[1]
pops.long<-reshape(data = pops, idvar = "cofips", varying = list(names(pops)[9:16]), direction="long", drop = names(pops)[c(2,3,4,5,6,8,17,18,19,20)], v.names = "population")
pops.long$year<-pops.long$time+1999
head(pops.long)
##         sumlev          ctyname cofips struct time population year
## 48001.1     50  Anderson County  48001      1    1      55062 2000
## 48003.1     50   Andrews County  48003      2    1      12949 2000
## 48005.1     50  Angelina County  48005      3    1      80270 2000
## 48007.1     50   Aransas County  48007      4    1      22452 2000
## 48009.1     50    Archer County  48009      5    1       8966 2000
## 48011.1     50 Armstrong County  48011      6    1       2166 2000
dat.long<-merge(pops.long, df, by.x=c("cofips", "year"), by.y=c("cofips", "YEAR"))

#v00<-load_variables(year=2000, dataset = "sf3", cache = T)
cov_dat<-get_decennial(geography = "county", state = "TX", year = 2000, sumfile = "sf3",
                       summary_var = "P001001",
                       variables = c("P007003", "P007004","P007010","P053001", "P089001", "P089002" ),
                      output = "wide")
## Getting data from the 2000 decennial Census
cov_dat<-cov_dat%>%
  mutate(cofips=GEOID,pwhite=P007003/summary_value, pblack=P007004/summary_value, phisp=P007010/summary_value,medhhinc=as.numeric(scale(P053001)), ppov=P089002/P089001)


final.dat<-merge(dat.long, cov_dat, by="cofips")

rates<-aggregate(cbind(deaths, births,population)~1, final.dat,sum)
rates$dr<-rates$deaths/rates$population
rates$br<-rates$births/rates$population


final.dat$E_d<-final.dat$population*rates$dr

final.dat$E_b<-final.dat$population*rates$br
final.dat<-final.dat[order(final.dat$cofips, final.dat$year),]
final.dat$id<-1:dim(final.dat)[1]

head(final.dat)
##   cofips year sumlev         ctyname struct time population births deaths
## 1  48001 2000     50 Anderson County      1    1      55062    673    626
## 2  48001 2001     50 Anderson County      1    2      54263    651    576
## 3  48001 2002     50 Anderson County      1    3      54740    712    616
## 4  48001 2003     50 Anderson County      1    4      56068    611    596
## 5  48001 2004     50 Anderson County      1    5      56245    670    597
## 6  48001 2005     50 Anderson County      1    6      56873    670    551
##   GEOID            NAME P007003 P007004 P007010 P053001 P089001 P089002
## 1 48001 Anderson County   34832   12849    6614   31957   40401    6654
## 2 48001 Anderson County   34832   12849    6614   31957   40401    6654
## 3 48001 Anderson County   34832   12849    6614   31957   40401    6654
## 4 48001 Anderson County   34832   12849    6614   31957   40401    6654
## 5 48001 Anderson County   34832   12849    6614   31957   40401    6654
## 6 48001 Anderson County   34832   12849    6614   31957   40401    6654
##   summary_value    pwhite    pblack     phisp    medhhinc      ppov
## 1         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
## 2         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
## 3         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
## 4         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
## 5         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
## 6         55109 0.6320565 0.2331561 0.1200167 -0.09855534 0.1646989
##        E_d      E_b id
## 1 382.8695 942.6150  1
## 2 377.3137 928.9368  2
## 3 380.6305 937.1027  3
## 4 389.8646 959.8369  4
## 5 391.0954 962.8670  5
## 6 395.4621 973.6178  6
options(scipen=999)

Next we make the spatial information, we get the polygons from census directly using counties from the tigris package.

us_co<-counties(state = "TX", cb = T)

#In INLA, we don't need FIPS codes, we need a simple numeric index for our counties
us_co$struct<-1:dim(us_co@data[1])
## Warning in 1:dim(us_co@data[1]): numerical expression has 2 elements: only
## the first used
nbs<-poly2nb(us_co, queen = T, row.names = us_co$struct)
mat <- nb2mat(nbs, style="B",zero.policy=TRUE)
colnames(mat) <- rownames(mat) 
mat <- as.matrix(mat[1:dim(mat)[1], 1:dim(mat)[1]])


nb2INLA("am_graph",nbs)
am_adj <-paste(getwd(),"/am_graph",sep="")
H<-inla.read.graph(filename="am_graph")
image(inla.graph2matrix(H), xlab="", ylab="", main="")

library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.3, proj.4 4.9.3
tx<-st_as_sf(us_co)
tx%>%
  ggplot()+geom_sf()

final.dat<-merge( tx,final.dat, by="struct")

Model setup

ggplot(data = final.dat)+geom_histogram(aes(x =deaths , y=0.5*..density..))+facet_wrap(~year)+
  ggtitle(label = "Distribution of Deaths by Year", subtitle = "Texas Counties, 2000-2007")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

ggplot(data = final.dat)+geom_histogram(aes(x =deaths/E_d , y=0.5*..density..))+facet_wrap(~year)+
  ggtitle(label = "Distribution of Morality Relative Risk by Year", subtitle = "Texas Counties, 2000-2007")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

We can fit these model using the bayesian framework with INLA.

First, we consider the basic GLM for the mortality outcome, with out any hierarhical structure. We can write this model as a Negative Binomial model, for instance as:

\[\text{Deaths_ij} = \text{log(E_d)} + X' \beta\]

INLA will use vague Normal priors for the \(\beta\)’s, and we have not other parameters in the model to specify priors for. INLA does not require you to specify all priors, as all parameters have a default prior specification.

#Model specification:
f1<-deaths~scale(pblack)+scale(phisp)+scale(ppov)+year

#Model fit
mod1<-inla(formula = f1,data = final.dat, #linear predictor - fixed effects
           family = "nbinomial", E = E_d,  #marginal distribution for the outcome, expected count
           control.compute = list(dic=T), # compute DIC or not?
           control.predictor = list(link=1)) #estimate predicted values & their marginals or not?
#model summary
summary(mod1)
## 
## Call:
## c("inla(formula = f1, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    control.compute = list(dic = T), control.predictor = list(link = 1))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.2490          2.4679          0.1527          3.8696 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)   16.7813 4.9367     7.0870  16.7815    26.4657 16.7822   0
## scale(pblack) -0.0858 0.0063    -0.0983  -0.0858    -0.0733 -0.0858   0
## scale(phisp)  -0.3171 0.0095    -0.3357  -0.3171    -0.2985 -0.3172   0
## scale(ppov)    0.2005 0.0093     0.1821   0.2005     0.2188  0.2005   0
## year          -0.0082 0.0025    -0.0130  -0.0082    -0.0034 -0.0082   0
## 
## The model has no random effects
## 
## Model hyperparameters:
##                                                         mean     sd
## size for the nbinomial observations (1/overdispersion) 18.54 0.6598
##                                                        0.025quant 0.5quant
## size for the nbinomial observations (1/overdispersion)      17.27    18.53
##                                                        0.975quant  mode
## size for the nbinomial observations (1/overdispersion)      19.86 18.51
## 
## Expected number of effective parameters(std dev): 5.192(0.0061)
## Number of equivalent replicates : 391.34 
## 
## Deviance Information Criterion (DIC) ...: 21420.19
## Effective number of parameters .........: 6.233
## 
## Marginal log-Likelihood:  -10747.31 
## Posterior marginals for linear predictor and fitted values computed

Plot our observed vs fitted values

plot(x= mod1$summary.fitted.values$mean, y=final.dat$deaths/final.dat$E_d , ylab="Observed", xlab="Estimated" )

Now we add basic nesting of rates within counties, with a random intercept term for each county. This would allow there to be heterogenity in the mortality rate for each county, over and above each county’s observed characteristics.

This model would be:

\[\text{Deaths_ij} = \text{log(E_d)} + X' \beta + u_j\] \[u_j \sim \text{Normal} (0 , \tau_u)\]

where \(\tau_u\) here is the precision, not the variance and precision = 1/variance. INLA puts a log-gamma prior on the the precision by default.

f2<-deaths~scale(pblack)+scale(phisp)+scale(ppov)+year+ #fixed effects
  f(struct, model = "iid")  #random effects
mod2<-inla(formula = f2,data = final.dat,
           family = "nbinomial", E = E_d, 
           control.compute = list(dic=T), 
           control.predictor = list(link=1))

#total model summary
summary(mod2)
## 
## Call:
## c("inla(formula = f2, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    control.compute = list(dic = T), control.predictor = list(link = 1))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.2208         16.6113          0.2227         18.0548 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)   15.9592 1.2344    13.5361  15.9588    18.3825 15.9580   0
## scale(pblack) -0.0771 0.0176    -0.1117  -0.0772    -0.0425 -0.0772   0
## scale(phisp)  -0.2986 0.0254    -0.3484  -0.2986    -0.2486 -0.2987   0
## scale(ppov)    0.1806 0.0236     0.1341   0.1806     0.2269  0.1807   0
## year          -0.0078 0.0006    -0.0090  -0.0078    -0.0066 -0.0078   0
## 
## Random effects:
## Name   Model
##  struct   IID model 
## 
## Model hyperparameters:
##                                                           mean     sd
## size for the nbinomial observations (1/overdispersion) 1343.36 201.60
## Precision for struct                                     16.99   1.59
##                                                        0.025quant 0.5quant
## size for the nbinomial observations (1/overdispersion)     999.10  1324.76
## Precision for struct                                        14.04    16.93
##                                                        0.975quant    mode
## size for the nbinomial observations (1/overdispersion)    1790.78 1285.36
## Precision for struct                                        20.29   16.83
## 
## Expected number of effective parameters(std dev): 249.36(0.6697)
## Number of equivalent replicates : 8.149 
## 
## Deviance Information Criterion (DIC) ...: 17342.39
## Effective number of parameters .........: 250.06
## 
## Marginal log-Likelihood:  -9130.00 
## Posterior marginals for linear predictor and fitted values computed

Marginal Distributions of hyperparameters We can plot the posterior marginal of the hyperparameter in this model, in this case \(\sigma_u = 1/\tau_u\)

m2<- inla.tmarginal(
        function(x) (1/x), #invert the precision to be on variance scale
        mod2$marginals.hyperpar$`Precision for struct`)

inla.hpdmarginal(.95, marginal=m2)
##                   low       high
## level:0.95 0.04883561 0.07047514
plot(m2, type="l", main=c("Posterior distibution for between county variance", "- IID model -"), xlim=c(0, .1))

Observed vs. Fitted values

plot(x= mod2$summary.fitted.values$mean, y=final.dat$deaths/final.dat$E_d , ylab="Observed", xlab="Estimated" )
points(x= mod1$summary.fitted.values$mean, y=final.dat$deaths/final.dat$E_d, col=2)
legend("topleft", legend = c("GLM", "GLMM(IID)"), col=c(1,2), pch=1)

We see a couple of things here. First, we see much closer agreement between the observed and predicted values, and we see the shrinkage of the estimates toward the mean.

final.dat$fitted_m2<-mod2$summary.fitted.values$mean

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 8))))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label="Relative Risk Quartile - IID Model, 2000")+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 8))))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label="Relative Risk Quartile - IID Model, 2007")+coord_sf(crs = 102008)

library(mapview)

map1<-final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m2, breaks = quantile(fitted_m2, p=seq(0,1,length.out = 8))))
clrs <- colorRampPalette(brewer.pal(8, "RdBu"))
mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs, map.types="OpenStreetMap")

Model with spatial correlation - Besag, York, and Mollie (1991) model and temporal heterogenity \[\text{Deaths_ij} = \text{log(E_d)} + X' \beta + u_j + v_j + \gamma_t\] Which has two random effects, one an IID random effect and the second a spatially correlated random effect, specified as a conditionally autoregressive prior for the \(v_j\)’s. This is the Besag model:

\[v_j|v_{\neq j},\sim\text{Normal}(\frac{1}{n_i}\sum_{i\sim j}v_j,\frac{1}{n_i\tau})\] and \(u_j\) is an IID normal random effect, \(\gamma_t\) is also given an IID Normal random effect specification, and there are now three hyperparameters, \(\tau_u\) and \(\tau_v\) and \(\tau_{\gamma}\) and each are given log-gamma priors.

For the BYM model we must specify the spatial connectivity matrix in the random effect.

#final.dat$year_c<-final.dat$year - 2004
f3<-deaths~scale(pblack)+scale(phisp)+scale(ppov)+
  f(struct, model = "bym", constr = T, scale.model = T, graph = mat)+
  f(year, model="iid") #temporal random effect
mod3<-inla(formula = f3,data = final.dat,
           family = "nbinomial", E = E_d, 
           control.compute = list(dic=T), 
           control.predictor = list(link=1))

#total model summary
summary(mod3)
## 
## Call:
## c("inla(formula = f3, family = \"nbinomial\", data = final.dat, E = E_d, ",  "    control.compute = list(dic = T), control.predictor = list(link = 1))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.5383         15.3501          0.4151         17.3036 
## 
## Fixed effects:
##                  mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept)    0.3312 0.0171     0.2977   0.3312     0.3646  0.3312   0
## scale(pblack) -0.0772 0.0177    -0.1118  -0.0772    -0.0425 -0.0772   0
## scale(phisp)  -0.2985 0.0254    -0.3484  -0.2986    -0.2486 -0.2986   0
## scale(ppov)    0.1805 0.0236     0.1340   0.1805     0.2268  0.1805   0
## 
## Random effects:
## Name   Model
##  struct   BYM model 
## year   IID model 
## 
## Model hyperparameters:
##                                                           mean       sd
## size for the nbinomial observations (1/overdispersion) 1152.93  124.053
## Precision for struct (iid component)                     17.06    1.629
## Precision for struct (spatial component)               2605.79 2141.720
## Precision for year                                     2874.10 1518.094
##                                                        0.025quant 0.5quant
## size for the nbinomial observations (1/overdispersion)     953.83  1136.23
## Precision for struct (iid component)                        14.12    16.97
## Precision for struct (spatial component)                   335.39  2041.52
## Precision for year                                         920.94  2555.57
##                                                        0.975quant    mode
## size for the nbinomial observations (1/overdispersion)    1437.98 1091.44
## Precision for struct (iid component)                        20.52   16.77
## Precision for struct (spatial component)                  8222.00  962.02
## Precision for year                                        6710.02 1992.57
## 
## Expected number of effective parameters(std dev): 255.12(0.5095)
## Number of equivalent replicates : 7.965 
## 
## Deviance Information Criterion (DIC) ...: 17298.02
## Effective number of parameters .........: 240.30
## 
## Marginal log-Likelihood:  -8975.37 
## Posterior marginals for linear predictor and fitted values computed
plot(y=mod3$summary.random$year_c$mean,x=unique(final.dat$year), type="l")

m3a<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for struct (iid component)`)
m3b<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for struct (spatial component)`)
m3c<- inla.tmarginal(
        function(x) (1/x),
        mod3$marginals.hyperpar$`Precision for year`)

plot(m3a, type="l", main=c("Posterior distibution for between county variance", "- IID model -"), xlim=c(0, .1), ylim=c(0,300))
lines(m3b, col="red")
lines(m3c, col="green")

inla.hpdmarginal(.95,m3a)
##                   low       high
## level:0.95 0.04836997 0.07021438
inla.hpdmarginal(.95,m3b)
##                      low        high
## level:0.95 0.00005184317 0.002186784
inla.hpdmarginal(.95,m3c)
##                     low         high
## level:0.95 0.0001034234 0.0009307634

This indicates very low spatially correlated variance in these data and very low temporal heterogenity as well.

Space-time mapping of the fitted values

final.dat$fitted_m3<-mod3$summary.fitted.values$mean

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 8))))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label="Relative Risk Quartile - IID Model, 2000")+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 8))))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label="Relative Risk Quartile - IID Model, 2007")+coord_sf(crs = 102008)

library(mapview)

map1<-final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(fitted_m3, breaks = quantile(fitted_m3, p=seq(0,1,length.out = 8))))
clrs <- colorRampPalette(brewer.pal(8, "RdBu"))
mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs, map.types="OpenStreetMap")

Map of spatial random effects

It is common to map the random effects from the BYM model to look for spatial trends, in this case, there are not strong spatial signals:

tx$sp_re<-mod3$summary.random$struct$mean[1:254]
tx%>%
  mutate(qse=cut(sp_re, breaks = quantile(sp_re, p=seq(0,1,length.out = 8))))%>%
  ggplot()+geom_sf(aes(fill=qse))+scale_colour_brewer(palette = "RdBu" )+scale_fill_brewer(palette = "RdBu", na.value="grey")+guides(fill=guide_legend(title="Spatial Excess Risk"))+ggtitle(label="Spatial Random Effect - BYM Model")+coord_sf(crs = 102008)

Exceedence probabilities

In Bayesian spatial models that are centered on an epidemiological type of outcome, it is common to examine the data for spatial clustering. One way to do this is to examine the clustering in the relative risk from one of these GLMM models. For instance if \(\theta\) is the relative risk \[\theta = exp(\beta_0 + \beta_1*x_1 + u_j)\] from one of our Negative binomial models above. We can use the posterior marginals of the relative risk to ask \(\theta \gt \theta^*\) where \(\theta^*\) is a specific level of excess risk, say 50% extra or \(\theta > 1.5\). If the density, or \(\text{Pr}(\theta \gt \theta^*)\) is high, then there is evidence that the excess risk is not only high, but significantly high.

To get the exceedence probabilites from one of our models, we can use the inla.pmarginal() function to ask if \(\text{Pr}(\theta \gt \theta^*)\)

thetastar<-1.5#theta*
inlaprob<- unlist(lapply(mod3$marginals.fitted.values, function(X){
   1-inla.pmarginal(thetastar, X)
}))
hist(inlaprob)

So, we see lots of occasions where the exceedence probability is greater than .9. We can visualize these in a map.

final.dat$exceedprob<-inlaprob

final.dat%>%
  filter(year%in%c(2000))%>%
  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "Blues" )+scale_fill_brewer(palette = "Blues", na.value="grey")+guides(fill=guide_legend(title=""))+ggtitle(label=expression(paste("Exceedence Probability Relative Risk ","Pr( ",theta," >1.5"," )  - 2000") ))+coord_sf(crs = 102008)

final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))%>%
  ggplot()+geom_sf(aes(fill=qrr))+scale_colour_brewer(palette = "Blues" )+scale_fill_brewer(palette = "Blues", na.value="grey")+guides(fill=guide_legend(title="Relative Risk Quartile"))+ggtitle(label=expression(paste("Exceedence Probability Relative Risk ","Pr( ",theta," >1.5"," )  - 2007") ))+coord_sf(crs = 102008)

library(mapview)

map1<-final.dat%>%
  filter(year%in%c(2007))%>%
  mutate(qrr=cut(exceedprob, breaks = c(0, .5, .9, .95, .99, 1), include.lowest = T))
  
clrs <- colorRampPalette(brewer.pal(6, "Blues"))
mapView(as(map1, "Spatial"), zcol="qrr", legend=T, col.regions=clrs, map.types="OpenStreetMap")

Which shows several areas of the state where risk the mortality rate is higher than the state rate.

Multi - Level Models

#load brfss
library(car)
library(knitr)
brfssurl<-"https://github.com/coreysparks/data/blob/master/brfss_14.Rdata?raw=true"
load(url(brfssurl))
set.seed(12345)
#samps<-sample(1:nrow(brfss_14), size = 90000, replace=F)
#brfss_14<-brfss_14[samps,]
#The names in the data are very ugly, so I make them less ugly
nams<-names(brfss_14)
#we see some names are lower case, some are upper and some have a little _ in the first position. This is a nightmare.
newnames<-gsub(pattern = "x_",replacement =  "",x =  nams)
names(brfss_14)<-tolower(newnames)
#BMI
brfss_14$bmi<-ifelse(is.na(brfss_14$bmi5)==T, NA, brfss_14$bmi5/100)
brfss_14$obese<-ifelse(brfss_14$bmi>30,1,0)
#Poor or fair self rated health
#brfss_14$badhealth<-ifelse(brfss_14$genhlth %in% c(4,5),1,0)
brfss_14$badhealth<-recode(brfss_14$genhlth, recodes="4:5=1; 1:3=0; else=NA")
#race/ethnicity
brfss_14$black<-recode(brfss_14$racegr3, recodes="2=1; 9=NA; else=0")
brfss_14$white<-recode(brfss_14$racegr3, recodes="1=1; 9=NA; else=0")
brfss_14$other<-recode(brfss_14$racegr3, recodes="3:4=1; 9=NA; else=0")
brfss_14$hispanic<-recode(brfss_14$racegr3, recodes="5=1; 9=NA; else=0")
brfss_14$race_eth<-recode(brfss_14$racegr3, recodes="1='nhwhite'; 2='nh black'; 3='nh other';
                         4='nh multirace'; 5='hispanic'; else=NA", as.factor = T)
brfss_14$race_eth<-relevel(brfss_14$race_eth, ref = "nhwhite")
#insurance
brfss_14$ins<-ifelse(brfss_14$hlthpln1==1,1,0)

#income grouping
brfss_14$inc<-ifelse(brfss_14$incomg==9, NA, brfss_14$incomg)

#education level
brfss_14$educ<-recode(brfss_14$educa, recodes="1:2='0Prim'; 3='1somehs'; 4='2hsgrad';
                     5='3somecol'; 6='4colgrad';9=NA", as.factor=T)
#brfss_14$educ<-relevel(brfss_14$educ, ref='0Prim')

#employment
brfss_14$employ<-recode(brfss_14$employ, recodes="1:2='Employed'; 2:6='nilf';
                       7='retired'; 8='unable'; else=NA", as.factor=T)
brfss_14$employ<-relevel(brfss_14$employ, ref='Employed')

#marital status
brfss_14$marst<-recode(brfss_14$marital, recodes="1='married'; 2='divorced'; 3='widowed';
                      4='separated'; 5='nm';6='cohab'; else=NA", as.factor=T)
brfss_14$marst<-relevel(brfss_14$marst, ref='married')

#Age cut into intervals
brfss_14$agec<-cut(brfss_14$age80, breaks=c(0,24,39,59,79,99), include.lowest = T)

I want to see how many people we have in each MSA in the data:

#Now we will begin fitting the multilevel regression model with the msa
#that the person lives in being the higher level
head(data.frame(name=table(brfss_14$mmsaname),id=unique(brfss_14$mmsa)))
##                                                          name.Var1
## 1                      Aberdeen, SD, Micropolitan Statistical Area
## 2             Aguadilla-Isabela, PR, Metropolitan Statistical Area
## 3                   Albuquerque, NM, Metropolitan Statistical Area
## 4 Allentown-Bethlehem-Easton, PA-NJ, Metropolitan Statistical Area
## 5                     Anchorage, AK, Metropolitan Statistical Area
## 6 Atlanta-Sandy Springs-Roswell, GA, Metropolitan Statistical Area
##   name.Freq    id
## 1       620 10100
## 2       544 10380
## 3      1789 10740
## 4      1095 10900
## 5      1785 11260
## 6      2776 12060
#people within each msa

#How many total MSAs are in the data?
length(table(brfss_14$mmsa))
## [1] 132
#counties

Higher level predictors

We will often be interested in factors at both the individual AND contextual levels. To illustrate this, I will use data from the American Community Survey measured at the MSA level. Specifically, I use the DP3 table, which provides economic characteristics of places, from the 2010 5 year ACS Link.

library(acs)
#Get 2010 ACS median household incomes for tracts in Texas
msaacs<-geo.make(msa="*")

acsecon<-acs.fetch(key=mykey, endyear=2010, span=5, geography=msaacs, variable = c("B19083_001","B17001_001","B17001_002", "B03002_001","B03002_004", "B03002_012" ))

colnames(acsecon@estimate)
## [1] "B19083_001" "B17001_001" "B17001_002" "B03002_001" "B03002_004"
## [6] "B03002_012"
msaecon<-data.frame(gini=acsecon@estimate[, "B19083_001"], 
ppoverty=acsecon@estimate[, "B17001_002"]/acsecon@estimate[, "B17001_001"],
pblack=acsecon@estimate[,"B03002_004"]/acsecon@estimate[, "B03002_001"],
phisp=acsecon@estimate[,"B03002_012"]/acsecon@estimate[, "B03002_001"],
giniz=scale(acsecon@estimate[, "B19083_001"]), 
ppovertyz=scale(acsecon@estimate[, "B17001_002"]/acsecon@estimate[, "B17001_001"]))
msaecon$zpblack<-scale(msaecon$pblack)
msaecon$zphisp<-scale(msaecon$phisp)
msaecon$ids<-paste(acsecon@geography$metropolitanstatisticalareamicropolitanstatisticalarea)

Let’s see the geographic variation in these economic indicators:

library(tigris)
msa<-core_based_statistical_areas(cb=T)
msa_ec<-geo_join(msa, msaecon, "CBSAFP", "ids", how="inner")

tx_ec<-msa_ec[grep(msa_ec$NAME, pattern = "TX"), ]
library(RColorBrewer)
library(sp)
spplot(tx_ec, "gini", at=quantile(tx_ec$gini), col.regions=brewer.pal(n=6, "Reds"), col="transparent", main="Gini Coefficient")

spplot(tx_ec, "phisp", at=quantile(tx_ec$phisp), col.regions=brewer.pal(n=6, "Reds"), col="transparent", main="Percent Hispanic")

Create spatial information for higher level units

#See what counties are in the brfss data
tx_ec$struct<-1:dim(tx_ec)[1]
city.est.dat<-tx_ec@data[,c( "giniz","ppovertyz", "zpblack", "zphisp", "struct")]
city.est.dat$obese<-NA

head(city.est.dat)
##         giniz   ppovertyz      zpblack    zphisp struct obese
## 7   1.6563723  2.68315742 -0.685922045 4.2102849      1    NA
## 10  0.6261542  1.95562295 -0.690319376 4.6557381      2    NA
## 17 -0.2752867  0.38023597 -0.312542768 2.3699550      3    NA
## 24  2.4290359  1.64782952  0.168694184 0.5513964      4    NA
## 28  0.5295712 -0.08529324 -0.244977156 2.3297027      5    NA
## 36  0.7227371 -0.25647600 -0.005019977 0.1616555      6    NA
brfss_14$cbsa<-as.character(brfss_14$mmsa)
indat<-merge(brfss_14, tx_ec, by.x="cbsa", by.y="CBSAFP", all.x=T)

brf.est<-indat[, c("giniz","ppovertyz", "zpblack", "zphisp", "struct", "obese")]
brf.est<-brf.est[order(brf.est$struct),]
head(brf.est)
##          giniz ppovertyz   zpblack    zphisp struct obese
## 49956 2.429036   1.64783 0.1686942 0.5513964      4     1
## 49957 2.429036   1.64783 0.1686942 0.5513964      4     1
## 49958 2.429036   1.64783 0.1686942 0.5513964      4     0
## 49959 2.429036   1.64783 0.1686942 0.5513964      4     0
## 49960 2.429036   1.64783 0.1686942 0.5513964      4     1
## 49961 2.429036   1.64783 0.1686942 0.5513964      4     0
##Here is where I add the cities that need to be estimated to the rest of the data
m.est<-rbind(city.est.dat, brf.est)

struct.in<-unique(brf.est$struct)
m.est$comp<-ifelse(m.est$struct%in%struct.in ,1,0)
m.est$rm<-ifelse(m.est$comp==1&is.na(m.est$obese)==T,1,0)

m.est<-m.est[-which(m.est$rm==1),]
m.est<-m.est[is.na(m.est$struct)==F,]
m.est<-m.est[order(m.est$struct),]

# 
# fake_dat<-expand.grid(race_eth=levels(brfss_14$race_eth), agec=levels(brfss_14$agec), CBSAFP=levels(as.factor(tx_ec$CBSAFP) ))
# fake_dat<-merge(fake_dat, tx_ec, by="CBSAFP")

library(spdep)
nbs<-knearneigh(coordinates(tx_ec), longlat = T, k = 4)
nbs<-knn2nb(nbs, row.names = tx_ec$struct, sym = T)
mat <- nb2mat(nbs, style="B",zero.policy=TRUE)
colnames(mat) <- rownames(mat) 
mat <- as.matrix(mat[1:dim(mat)[1], 1:dim(mat)[1]])
fit_est<- inla(obese~ giniz+zpblack+zphisp+
                 f(struct, model="bym", graph=mat,constr=TRUE,  scale.model=TRUE),
               family = "binomial",Ntrials = 1,
               data=m.est,  num.threads = 2,
               control.predictor = list(link=1))
#               control.inla=list(strategy='gaussian'))
#
summary(fit_est)
## 
## Call:
## c("inla(formula = obese ~ giniz + zpblack + zphisp + f(struct, model = \"bym\", ",  "    graph = mat, constr = TRUE, scale.model = TRUE), family = \"binomial\", ",  "    data = m.est, Ntrials = 1, control.predictor = list(link = 1), ",  "    num.threads = 2)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.3010        252.3749          0.4299        254.1058 
## 
## Fixed effects:
##                mean     sd 0.025quant 0.5quant 0.975quant    mode kld
## (Intercept) -1.0249 0.1523    -1.3008  -1.0342    -0.6955 -1.0497   0
## giniz        0.0827 0.0940    -0.1133   0.0858     0.2595  0.0907   0
## zpblack     -0.0280 0.1947    -0.4391  -0.0214     0.3465 -0.0113   0
## zphisp       0.0882 0.0608    -0.0397   0.0903     0.2039  0.0937   0
## 
## Random effects:
## Name   Model
##  struct   BYM model 
## 
## Model hyperparameters:
##                                             mean      sd 0.025quant
## Precision for struct (iid component)      394.55 1047.49      16.44
## Precision for struct (spatial component) 2131.41 2414.89     129.05
##                                          0.5quant 0.975quant   mode
## Precision for struct (iid component)       147.89    2296.16  39.30
## Precision for struct (spatial component)  1393.44    8492.95 342.89
## 
## Expected number of effective parameters(std dev): 5.788(0.6795)
## Number of equivalent replicates : 1450.66 
## 
## Marginal log-Likelihood:  -5125.44 
## Posterior marginals for linear predictor and fitted values computed
m.est$est.obese<-fit_est$summary.fitted.values$mean
m.est.est<-tapply(m.est$est.obese, m.est$struct, mean, na.rm=T)
m.est.est<-data.frame(struct=names(unlist(m.est.est)), obeseest=unlist(m.est.est))
#m.est<-m.est[is.na(m.est$bmi)==T,]
msa.est<-merge(tx_ec, m.est.est, by.y="struct", by.x="struct", all.x=T, sort=F)
head(msa.est@data)
##   struct CSAFP CBSAFP       AFFGEOID GEOID                          NAME
## 1      1   154  15180 310M300US15180 15180     Brownsville-Harlingen, TX
## 2      2  <NA>  29700 310M300US29700 29700                    Laredo, TX
## 3      3  <NA>  38380 310M300US38380 38380                 Plainview, TX
## 4      4  <NA>  17780 310M300US17780 17780     College Station-Bryan, TX
## 5      5  <NA>  41700 310M300US41700 41700 San Antonio-New Braunfels, TX
## 6      6  <NA>  48660 310M300US48660 48660             Wichita Falls, TX
##   LSAD       ALAND    AWATER  gini  ppoverty      pblack     phisp
## 1   M1  2307891059 998035120 0.492 0.3473913 0.002924541 0.8741380
## 2   M1  8706186702  36555096 0.460 0.2976883 0.002346617 0.9544532
## 3   M2  2602115387    246678 0.432 0.1900627 0.051996338 0.5423268
## 4   M1  5439580277  85909281 0.516 0.2766608 0.115243452 0.2144409
## 5   M1 18939926777 150513456 0.457 0.1582592 0.060876225 0.5350693
## 6   M1  6784786637 144176905 0.463 0.1465645 0.092412870 0.1441707
##        giniz   ppovertyz      zpblack    zphisp   ids  obeseest
## 1  1.6563723  2.68315742 -0.685922045 4.2102849 15180 0.3837053
## 2  0.6261542  1.95562295 -0.690319376 4.6557381 29700 0.3698816
## 3 -0.2752867  0.38023597 -0.312542768 2.3699550 38380 0.3041471
## 4  2.4290359  1.64782952  0.168694184 0.5513964 17780 0.3058289
## 5  0.5295712 -0.08529324 -0.244977156 2.3297027 41700 0.3056169
## 6  0.7227371 -0.25647600 -0.005019977 0.1616555 48660 0.2970790
library(mapview)
clrs <- colorRampPalette(brewer.pal(8, "Blues"))

mapview(msa.est,"obeseest", col.regions=clrs, map.types="OpenStreetMap")

Multi-level model in INLA

INLA is very good at doing multi-level models, and much faster than other modeling strategies. Plus we can build in the correlated random effects, such as the Besag model at the higher level of analysis. Below, we fit three models. A basic random intercept model: This corresponds to a multilevel logistic model with a higher level variables as predictors and can be written: \[logit(y_{ij}) = \beta_{0j} + \sum {\beta x_i}\]

\[\beta_{0j} = \beta_0 + u_j\]

with \[u_j \sim N(0, \tau_u)\]

library(INLA)
indat<-indat[is.na(indat$struct)==F,]
fit_in1<- inla(obese~ race_eth+agec+f(struct, model="iid"),
               family = "binomial", Ntrials =1,
               data=indat, num.threads = 2)
summary(fit_in1)
## 
## Call:
## c("inla(formula = obese ~ race_eth + agec + f(struct, model = \"iid\"), ",  "    family = \"binomial\", data = indat, Ntrials = 1, num.threads = 2)" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.4176          3.6431          0.0527          5.1133 
## 
## Fixed effects:
##                         mean     sd 0.025quant 0.5quant 0.975quant    mode
## (Intercept)          -2.0850 0.1451    -2.3750  -2.0834    -1.8043 -2.0800
## race_ethhispanic      0.5357 0.0577     0.4221   0.5358     0.6486  0.5360
## race_ethnh black      0.8600 0.0896     0.6838   0.8601     1.0357  0.8602
## race_ethnh multirace  0.2368 0.2268    -0.2188   0.2404     0.6727  0.2475
## race_ethnh other     -0.6385 0.1809    -1.0046  -0.6346    -0.2938 -0.6268
## agec(24,39            0.8669 0.1512     0.5756   0.8650     1.1691  0.8612
## agec(39,59            1.2868 0.1435     1.0114   1.2846     1.5748  1.2802
## agec(59,79            1.1912 0.1437     0.9154   1.1890     1.4797  1.1847
## agec(79,99            0.3099 0.1701    -0.0208   0.3088     0.6466  0.3066
##                      kld
## (Intercept)            0
## race_ethhispanic       0
## race_ethnh black       0
## race_ethnh multirace   0
## race_ethnh other       0
## agec(24,39             0
## agec(39,59             0
## agec(59,79             0
## agec(79,99             0
## 
## Random effects:
## Name   Model
##  struct   IID model 
## 
## Model hyperparameters:
##                          mean       sd 0.025quant 0.5quant 0.975quant
## Precision for struct 10259.58 16726.18      41.14  2168.30   59222.95
##                       mode
## Precision for struct 69.50
## 
## Expected number of effective parameters(std dev): 10.13(1.511)
## Number of equivalent replicates : 829.10 
## 
## Marginal log-Likelihood:  -5012.48
1/fit_in1$summary.hyperpar
##                               mean            sd 0.025quant     0.5quant
## Precision for struct 0.00009746986 0.00005978651 0.02431007 0.0004611908
##                         0.975quant       mode
## Precision for struct 0.00001688535 0.01438795
m<- inla.tmarginal(
        function(x) (1/x),
        fit_in1$marginals.hyperpar$`Precision for struct`)

inla.hpdmarginal(.95, marginal=m)
##                       low       high
## level:0.95 0.000007273682 0.02080729
plot(m, type="l", main=c("Posterior distibution for between MSA variance", "- Random Intercept model -"))

That is our individual level, random intercept model. Now I will fit a model that includes MSA level demographic characteristics, the Gini index, %black and %hispanic. This corresponds to a multi-level model with higher level predictors:

\[y_{ij} = \beta_{0j} + \sum {\beta x_i} + \sum {\gamma z_j}\]

\[\beta_{0j} = \beta_0 + \sum {\gamma z_j}+ u_j\]

with \[u_j \sim N(0, \tau_u)\]

fit_in2<- inla(obese~ race_eth+agec+giniz+zpblack+zphisp+f(struct, model="iid"),
               data=indat,
               family="binomial", Ntrials = 1)
summary(fit_in2)
## 
## Call:
## c("inla(formula = obese ~ race_eth + agec + giniz + zpblack + zphisp + ",  "    f(struct, model = \"iid\"), family = \"binomial\", data = indat, ",  "    Ntrials = 1)")
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.6174          4.3470          0.1090          6.0734 
## 
## Fixed effects:
##                         mean     sd 0.025quant 0.5quant 0.975quant    mode
## (Intercept)          -2.1662 0.1915    -2.5204  -2.1735    -1.7632 -2.1823
## race_ethhispanic      0.5260 0.0609     0.4065   0.5260     0.6453  0.5260
## race_ethnh black      0.8585 0.0900     0.6817   0.8586     1.0348  0.8587
## race_ethnh multirace  0.2383 0.2269    -0.2174   0.2418     0.6743  0.2490
## race_ethnh other     -0.6367 0.1809    -1.0029  -0.6329    -0.2920 -0.6250
## agec(24,39            0.8705 0.1512     0.5792   0.8686     1.1728  0.8648
## agec(39,59            1.2851 0.1435     1.0096   1.2829     1.5732  1.2785
## agec(59,79            1.1867 0.1439     0.9105   1.1846     1.4755  1.1802
## agec(79,99            0.2992 0.1703    -0.0320   0.2981     0.6364  0.2959
## giniz                 0.0893 0.0838    -0.0943   0.0942     0.2353  0.0993
## zpblack              -0.0641 0.1603    -0.4363  -0.0516     0.2030 -0.0412
## zphisp                0.0036 0.0513    -0.1146   0.0079     0.0892  0.0121
##                         kld
## (Intercept)          0.0000
## race_ethhispanic     0.0000
## race_ethnh black     0.0000
## race_ethnh multirace 0.0000
## race_ethnh other     0.0000
## agec(24,39           0.0000
## agec(39,59           0.0000
## agec(59,79           0.0000
## agec(79,99           0.0000
## giniz                0.0000
## zpblack              0.0001
## zphisp               0.0000
## 
## Random effects:
## Name   Model
##  struct   IID model 
## 
## Model hyperparameters:
##                         mean       sd 0.025quant 0.5quant 0.975quant  mode
## Precision for struct 8789.87 15052.74      16.47   825.98   53621.54 32.91
## 
## Expected number of effective parameters(std dev): 12.76(0.9582)
## Number of equivalent replicates : 658.05 
## 
## Marginal log-Likelihood:  -5029.88
1/fit_in2$summary.hyperpar
##                              mean            sd 0.025quant    0.5quant
## Precision for struct 0.0001137673 0.00006643307  0.0607064 0.001210679
##                         0.975quant       mode
## Precision for struct 0.00001864922 0.03038511
m2<- inla.tmarginal(
        function(x) (1/x),
        fit_in2$marginals.hyperpar$`Precision for struct`)

inla.hpdmarginal(.95, marginal=m2)
##                       low       high
## level:0.95 0.000008128605 0.04459525
plot(m2, type="l", main=c("Posterior distibution for between MSA variance", "- Multi-level model -"))

Finally, we model the county level means using the Besag-York and Mollie model that we had used on the areal data last week. In this example, we are modeling the correlation between MSAs at the higher level, instead of like last week, where we were modeling the correlation among neighboring counties.

This is a multi level model with correlation between neighboring MSAs, assuming a Besag, York and Mollie, 1991 convolution prior for the random effects \[y_{ij} = \beta_{0} +\sum {\beta x_i} + \sum {\gamma z_j} + u_j + v_j\]

\[\beta_{0j} = \beta_0 + \sum {\gamma z_j}+ u_j+v_j\]

with

\[u_j \sim N(0, \tau_u)\]

\[v_j|v_{\neq j},\sim\text{Normal}(\frac{1}{n_i}\sum_{i\sim j}v_j,\frac{1}{n_i\tau})\]

fit_in3<- inla(obese~ race_eth+agec+giniz+zpblack+zphisp+
                 f(struct, model="bym", graph=mat),
               family = "binomial", Ntrials = 1, 
               data=indat,
               control.predictor = list(link=1))

summary(fit_in3)
## 
## Call:
## c("inla(formula = obese ~ race_eth + agec + giniz + zpblack + zphisp + ",  "    f(struct, model = \"bym\", graph = mat), family = \"binomial\", ",  "    data = indat, Ntrials = 1, control.predictor = list(link = 1))" )
## 
## Time used:
##  Pre-processing    Running inla Post-processing           Total 
##          1.7982        330.3397          0.6054        332.7433 
## 
## Fixed effects:
##                         mean     sd 0.025quant 0.5quant 0.975quant    mode
## (Intercept)          -2.1123 0.2205    -2.5332  -2.1169    -1.6625 -2.1243
## race_ethhispanic      0.5265 0.0609     0.4069   0.5265     0.6460  0.5265
## race_ethnh black      0.8589 0.0900     0.6820   0.8589     1.0352  0.8590
## race_ethnh multirace  0.2387 0.2270    -0.2172   0.2422     0.6748  0.2493
## race_ethnh other     -0.6370 0.1809    -1.0033  -0.6331    -0.2921 -0.6253
## agec(24,39            0.8720 0.1512     0.5806   0.8701     1.1742  0.8662
## agec(39,59            1.2801 0.1435     1.0047   1.2779     1.5682  1.2736
## agec(59,79            1.1772 0.1438     0.9012   1.1751     1.4659  1.1707
## agec(79,99            0.2839 0.1702    -0.0470   0.2827     0.6209  0.2805
## giniz                 0.0752 0.1055    -0.1448   0.0785     0.2757  0.0838
## zpblack              -0.0960 0.2227    -0.5681  -0.0877     0.3344 -0.0740
## zphisp               -0.0078 0.0693    -0.1534  -0.0053     0.1250 -0.0010
##                      kld
## (Intercept)            0
## race_ethhispanic       0
## race_ethnh black       0
## race_ethnh multirace   0
## race_ethnh other       0
## agec(24,39             0
## agec(39,59             0
## agec(59,79             0
## agec(79,99             0
## giniz                  0
## zpblack                0
## zphisp                 0
## 
## Random effects:
## Name   Model
##  struct   BYM model 
## 
## Model hyperparameters:
##                                             mean      sd 0.025quant
## Precision for struct (iid component)      153.85  223.49      13.51
## Precision for struct (spatial component) 2450.82 2819.58     171.57
##                                          0.5quant 0.975quant   mode
## Precision for struct (iid component)        87.86     699.83  34.26
## Precision for struct (spatial component)  1595.63    9844.56 464.10
## 
## Expected number of effective parameters(std dev): 13.92(0.6688)
## Number of equivalent replicates : 603.12 
## 
## Marginal log-Likelihood:  -5010.08 
## Posterior marginals for linear predictor and fitted values computed
1/fit_in3$summary.hyperpar
##                                                  mean           sd
## Precision for struct (iid component)     0.0064998242 0.0044745170
## Precision for struct (spatial component) 0.0004080259 0.0003546625
##                                           0.025quant     0.5quant
## Precision for struct (iid component)     0.074045321 0.0113817337
## Precision for struct (spatial component) 0.005828483 0.0006267101
##                                            0.975quant        mode
## Precision for struct (iid component)     0.0014289183 0.029186839
## Precision for struct (spatial component) 0.0001015789 0.002154692
m3_sp<- inla.tmarginal(
        function(x) (1/x),
        fit_in3$marginals.hyperpar$`Precision for struct (spatial component)`)
m3_iid<- inla.tmarginal(
        function(x) (1/x),
        fit_in3$marginals.hyperpar$`Precision for struct (iid component)`)

inla.hpdmarginal(.95, marginal=m3_sp)
##                      low        high
## level:0.95 0.00003208837 0.003969693
inla.hpdmarginal(.95, marginal=m3_iid)
##                     low       high
## level:0.95 0.0003350537 0.05414296
plot(m3_sp, type="l", main=c("Posterior distibution for between Spatial MSA variance", "- Multi-level model -"), xlim=c(0, .015))
lines(m3_iid, col=2,lty=2)
legend("topright", legend=c("Spatial Variance", "IID Variance"), col=c(1,2), lty=c(1,2))

References

Besag, J., York, J., & Mollie, a. (1991). Bayesian image-restoration, with 2 applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43(1), 1-20. https://doi.org/10.1007/BF00116466